mlr3 book
Our book is the central entry point to the mlr3 ecosystem
This tutorial follows the book closely
mlr3verse
The mlr3verse package contains all important packages of the mlr3 ecosystem
mlr3 Philosophy
Object-oriented programming
Tabular data
Unified tabular input and output data formats
Defensive programming and type safety
Light on dependencies
Separation of computation and presentation
R’s more recent paradigms for object-oriented programming
Instances of an R6 class are created by using $new()
In practice often replaced by sugar functions
R6 objects may have mutable states that are encapsulated in their fields
Can be accessed and modified through the dollar $ operator
Fields can also be ‘active bindings’, which perform additional computations when referenced or modified.
R6 objects have methods that are functions that are associated with the object
Methods change the internal state of the objects
Or retrieve information about the object
R6 objects are environments
Most mlr3 objects are created with sugar functions
Reduces the amount of code a user has to write
For example, lrn() creates a learner object without having to use $new()
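As a sketch, both forms below construct the same learner; the sugar function is simply shorter (the `maxdepth` value is an arbitrary illustration):

```r
library(mlr3)

# sugar function: one call, construction arguments included
lrn_rpart = lrn("regr.rpart", maxdepth = 3)

# roughly equivalent long form using the R6 constructor
lrn_rpart2 = LearnerRegrRpart$new()
lrn_rpart2$param_set$values$maxdepth = 3
```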
R6 classes are stored in dictionaries
Associates keys with objects
<DictionaryTask> with 22 stored values
Keys: ames_housing, bike_sharing, boston_housing, breast_cancer,
california_housing, german_credit, ilpd, iris, kc_housing, moneyball,
mtcars, optdigits, penguins, penguins_simple, pima, ruspini, sonar,
spam, titanic, usarrests, wine, zoo
Use sugar functions to retrieve objects from dictionaries
mlr3 includes a few predefined tasks
Stored in the mlr_tasks dictionary
To get a task from the dictionary, use the tsk() function
Construct regression task with the as_task_regr() function
Keep only one feature
Keep only these rows
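The steps above might look like this (a sketch; the chosen feature and row ids are arbitrary):

```r
library(mlr3)

# construct a regression task from a data.frame
tsk_mtcars = as_task_regr(mtcars, target = "mpg", id = "cars")

# keep only one feature ...
tsk_mtcars$select("cyl")
# ... and only these rows
tsk_mtcars$filter(1:10)
```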
Help pages of functions can be queried with ?
For the help page of the mtcars task, you could use ?mlr_tasks_mtcars
$help() allows you to access the help page from any instance of that class
Unified interface to many popular ML algorithms
Access learners from the dictionary with lrn()
$feature_types: types of features the learner can handle
$packages: packages required to be installed to use the learner
$properties: properties of the learner, e.g. the “missings” property
$predict_types: types of prediction that the model can make
$param_set: set of available hyperparameters
# load mtcars task
tsk_mtcars = tsk("mtcars")
# load a regression tree
lrn_rpart = lrn("regr.rpart")
# pass the task to the learner via $train()
lrn_rpart$train(tsk_mtcars)
# inspect the trained model
lrn_rpart$model
n= 32
node), split, n, deviance, yval
* denotes terminal node
1) root 32 1126.04700 20.09062
2) cyl>=5 21 198.47240 16.64762
4) hp>=192.5 7 28.82857 13.41429 *
5) hp< 192.5 14 59.87214 18.26429 *
3) cyl< 5 11 203.38550 26.66364 *
Randomly split the given task into two disjoint sets
$train
[1] 3 5 7 8 9 10 11 14 15 16 17 22 23 24 25 26 27 28 29 31 32
$test
[1] 1 2 4 6 12 13 18 19 20 21 30
$validation
integer(0)
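A split like the one above can be produced with partition(); the exact row ids depend on the random seed:

```r
library(mlr3)
tsk_mtcars = tsk("mtcars")

# randomly split the task into disjoint train and test sets
splits = partition(tsk_mtcars, ratio = 0.67)
str(splits$train)  # row ids of the training set
str(splits$test)   # row ids of the test set
```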
Train the learner on the training set
Predict from trained model
Returns a Prediction object
<PredictionRegr> for 11 observations:
row_ids truth response
1 21.0 15.33571
2 21.0 15.33571
4 21.4 15.33571
--- --- ---
20 33.9 25.01429
21 21.5 25.01429
30 19.7 25.01429
Get tabular form:
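A minimal sketch of converting a Prediction object to a data.table for further analysis:

```r
library(mlr3)

tsk_mtcars = tsk("mtcars")
lrn_rpart = lrn("regr.rpart")
splits = partition(tsk_mtcars)
lrn_rpart$train(tsk_mtcars, splits$train)
prediction = lrn_rpart$predict(tsk_mtcars, splits$test)

# tabular form: one row per observation
tab = as.data.table(prediction)
head(tab)
```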
Affect how the learner is run. Represented as ParamSet object:
<ParamSet(10)>
id class lower upper nlevels default value
<char> <char> <num> <num> <num> <list> <list>
1: cp ParamDbl 0 1 Inf 0.01 [NULL]
2: keep_model ParamLgl NA NA 2 FALSE [NULL]
3: maxcompete ParamInt 0 Inf Inf 4 [NULL]
4: maxdepth ParamInt 1 30 30 30 [NULL]
5: maxsurrogate ParamInt 0 Inf Inf 5 [NULL]
6: minbucket ParamInt 1 Inf Inf <NoDefault[0]> [NULL]
7: minsplit ParamInt 1 Inf Inf 20 [NULL]
8: surrogatestyle ParamInt 0 1 2 0 [NULL]
9: usesurrogate ParamInt 0 2 3 2 [NULL]
10: xval ParamInt 0 Inf Inf 10 0
This defines the configuration space and contains the actual hyperparameter values.
Class of the parameter and the possible values
| Hyperparameter Class | Hyperparameter Type |
|---|---|
| ParamDbl | Real-valued (numeric) |
| ParamInt | Integer |
| ParamFct | Categorical (factor) |
| ParamLgl | Logical / Boolean |
| ParamUty | Untyped |
During construction
Updating after construction
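Both ways of setting hyperparameter values might be sketched as follows (the particular values are arbitrary):

```r
library(mlr3)

# set hyperparameters during construction
lrn_rpart = lrn("regr.rpart", maxdepth = 5, cp = 0.05)

# update them after construction
lrn_rpart$param_set$set_values(maxdepth = 3)
lrn_rpart$param_set$values$cp = 0.01
```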
Evaluating the model's performance
Perhaps the most important step of applied machine learning
Quantify the accuracy of the model’s predictions
We continue with the code from the previous slides
Quality of predictions is evaluated using measures
Access measures from the dictionary with msr()
Key: <key>
key label task_type predict_type
<char> <char> <char> <char>
1: ci Default CI <NA> response
2: ci.con_z Conservative-Z Interval <NA> response
3: ci.holdout Holdout Interval <NA> response
4: ci.ncv Nested CV Interval <NA> response
5: classif.fdr False Discovery Rate classif response
6: clust.wss Within Sum of Squares clust partition
7: regr.medse Median Squared Error regr response
Mean absolute error
\(f(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} | y_i - \hat{y}_i |\)
Score the predictions with the mean absolute error
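Scoring predictions with the mean absolute error could look like this (a sketch; exact scores depend on the random split):

```r
library(mlr3)

tsk_mtcars = tsk("mtcars")
lrn_rpart = lrn("regr.rpart")
splits = partition(tsk_mtcars)
lrn_rpart$train(tsk_mtcars, splits$train)
prediction = lrn_rpart$predict(tsk_mtcars, splits$test)

# score the predictions with the mean absolute error
prediction$score(msr("regr.mae"))
```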
Classification problems are ones in which a model predicts a discrete, categorical target
The interface is kept as similar as possible to regression
set.seed(349)
# load and partition our task
tsk_penguins = tsk("penguins")
splits = partition(tsk_penguins)
# load decision tree and set hyperparameters
lrn_rpart = lrn("classif.rpart", cp = 0.2, maxdepth = 5)
# load accuracy measure
measure = msr("classif.acc")
# train learner
lrn_rpart$train(tsk_penguins, splits$train)
# make and score predictions
lrn_rpart$predict(tsk_penguins, splits$test)$score(measure)
classif.acc
  0.9473684
The sonar task is an example of a binary classification problem
In mlr3 terminology it has the “twoclass” property
tsk("penguins") is a multiclass problem as there are more than two species of penguins; it has the “multiclass” property
Predictions in classification are either "response" – predicting an observation’s class or "prob" – predicting a vector of probabilities of an observation belonging to each class
lrn_rpart = lrn("classif.rpart", predict_type = "prob")
lrn_rpart$train(tsk_penguins, splits$train)
prediction = lrn_rpart$predict(tsk_penguins, splits$test)
prediction
<PredictionClassif> for 114 observations:
row_ids truth response prob.Adelie prob.Chinstrap prob.Gentoo
1 Adelie Adelie 0.97029703 0.02970297 0.0000000
2 Adelie Adelie 0.97029703 0.02970297 0.0000000
3 Adelie Adelie 0.97029703 0.02970297 0.0000000
--- --- --- --- --- ---
338 Chinstrap Chinstrap 0.04255319 0.93617021 0.0212766
339 Chinstrap Chinstrap 0.04255319 0.93617021 0.0212766
342 Chinstrap Chinstrap 0.04255319 0.93617021 0.0212766
To evaluate "response" predictions, you will need measures with predict_type = "response"
To evaluate probability predictions you will need predict_type = "prob"
The rows in a confusion matrix are the predicted class and the columns are the true class
All off-diagonal entries are incorrectly classified observations, and all diagonal entries are correctly classified
You can visualize the predicted class labels with autoplot.PredictionClassif()
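A sketch of inspecting the confusion matrix and visualizing predictions (autoplot() requires the mlr3viz package):

```r
library(mlr3)

tsk_penguins = tsk("penguins")
splits = partition(tsk_penguins)
lrn_rpart = lrn("classif.rpart")
lrn_rpart$train(tsk_penguins, splits$train)
prediction = lrn_rpart$predict(tsk_penguins, splits$test)

# rows: predicted class; columns: true class
prediction$confusion

# visualize predicted class labels
# library(mlr3viz); autoplot(prediction)
```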
The default response prediction type is the class with the highest predicted probability
task_credit = tsk("german_credit")
split = partition(task_credit)
lrn_rpart = lrn("classif.rpart", predict_type = "prob")
lrn_rpart$train(task_credit, split$train)
prediction = lrn_rpart$predict(task_credit, split$test)
prediction$score(msr("classif.acc"))
classif.acc
  0.7030303
truth
response good bad
good 187 71
bad 27 45
In binary classification, the positive class is predicted if its predicted probability exceeds 50%, and the negative class otherwise
Changing this threshold is useful when classes are imbalanced, when different costs are associated with the classes, or when there is a preference to ‘over’-predict one class
prediction$set_threshold(0.2)
prediction$score(msrs(c("classif.tpr", "classif.ppv", "classif.fbeta")))
  classif.tpr   classif.ppv classif.fbeta
0.9579439 0.6788079 0.7945736
Resampling strategies repeatedly split all available data into multiple training and test sets
Access resampling strategy from the dictionary with rsmp()
Key: <key>
key label
<char> <char>
1: bootstrap Bootstrap
2: custom Custom Splits
3: custom_cv Custom Split Cross-Validation
4: cv Cross-Validation
5: holdout Holdout
6: insample Insample Resampling
7: loo Leave-One-Out
8: nested_cv Nested CV
9: paired_subsampling Paired Subsampling
10: repeated_cv Repeated Cross-Validation
11: subsampling Subsampling
Holdout method
3-fold CV
Subsampling with 3 repeats and 9/10 ratio
2-repeats 5-fold CV
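The four strategies listed above are constructed like this (ratios and repeat counts mirror the list):

```r
library(mlr3)

rsmp("holdout", ratio = 0.8)                    # holdout method
rsmp("cv", folds = 3)                           # 3-fold CV
rsmp("subsampling", repeats = 3, ratio = 9/10)  # subsampling, 3 repeats, 9/10 ratio
rsmp("repeated_cv", repeats = 2, folds = 5)     # 2-repeats 5-fold CV
```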
resample() repeatedly fits a model on training sets and makes predictions on the corresponding test sets
Stores them in a ResampleResult object
tsk_penguins = tsk("penguins")
lrn_rpart = lrn("classif.rpart")
cv3 = rsmp("cv", folds = 3)
rr = resample(tsk_penguins, lrn_rpart, cv3, store_models = TRUE)
rr
<ResampleResult> with 3 resampling iterations
task_id learner_id resampling_id iteration prediction_test warnings
penguins classif.rpart cv 1 <PredictionClassif> 0
penguins classif.rpart cv 2 <PredictionClassif> 0
penguins classif.rpart cv 3 <PredictionClassif> 0
errors
0
0
0
We can calculate the score for each iteration with $score()
$aggregate() returns the aggregated score across all resampling iterations
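Both methods might be used like this (a sketch; scores depend on the random folds):

```r
library(mlr3)

rr = resample(tsk("penguins"), lrn("classif.rpart"), rsmp("cv", folds = 3))

# score for each resampling iteration
rr$score(msr("classif.acc"))[, .(iteration, classif.acc)]

# aggregated score across all iterations
rr$aggregate(msr("classif.acc"))
```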
Prediction object for each resampling iteration
Can also be used for model inspection
n= 229
node), split, n, loss, yval, (yprob)
* denotes terminal node
1) root 229 132 Adelie (0.423580786 0.240174672 0.336244541)
2) flipper_length< 206.5 146 51 Adelie (0.650684932 0.342465753 0.006849315)
4) bill_length< 43.05 95 4 Adelie (0.957894737 0.042105263 0.000000000) *
5) bill_length>=43.05 51 5 Chinstrap (0.078431373 0.901960784 0.019607843) *
3) flipper_length>=206.5 83 7 Gentoo (0.024096386 0.060240964 0.915662651)
6) island=Dream,Torgersen 7 2 Chinstrap (0.285714286 0.714285714 0.000000000) *
7) island=Biscoe 76 0 Gentoo (0.000000000 0.000000000 1.000000000) *
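Per-iteration predictions and fitted models (such as the rpart tree printed above) can be retrieved from the ResampleResult, provided store_models = TRUE was set:

```r
library(mlr3)

rr = resample(tsk("penguins"), lrn("classif.rpart"),
  rsmp("cv", folds = 3), store_models = TRUE)

# Prediction object of the first resampling iteration
rr$predictions()[[1]]

# fitted rpart model of the first iteration
rr$learners[[1]]$model
```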
Compare multiple learners on a single task
Or multiple learners on multiple tasks
tasks = tsks(c("german_credit", "sonar"))
learners = lrns(c("classif.rpart", "classif.ranger",
"classif.featureless"), predict_type = "prob")
rsmp_cv5 = rsmp("cv", folds = 5)
design = benchmark_grid(tasks, learners, rsmp_cv5)
design
            task             learner resampling
<char> <char> <char>
1: german_credit classif.rpart cv
2: german_credit classif.ranger cv
3: german_credit classif.featureless cv
4: sonar classif.rpart cv
5: sonar classif.ranger cv
6: sonar classif.featureless cv
Benchmark experiments are conducted with benchmark()
Runs resample() on each task and learner separately
Collects the results in a BenchmarkResult object
<BenchmarkResult> of 30 rows with 6 resampling runs
nr task_id learner_id resampling_id iters warnings errors
1 german_credit classif.rpart cv 5 0 0
2 german_credit classif.ranger cv 5 0 0
3 german_credit classif.featureless cv 5 0 0
4 sonar classif.rpart cv 5 0 0
5 sonar classif.ranger cv 5 0 0
6 sonar classif.featureless cv 5 0 0
$score() will return results over each fold of each learner/task/resampling combination
$aggregate() returns the aggregated score across all resampling iterations
task_id learner_id classif.ce
<char> <char> <num>
1: german_credit classif.rpart 0.2660000
2: german_credit classif.ranger 0.2400000
3: german_credit classif.featureless 0.3000000
4: sonar classif.rpart 0.2732869
5: sonar classif.ranger 0.1775842
6: sonar classif.featureless 0.5384437
Collection of multiple ResampleResult objects
<ResampleResult> with 5 resampling iterations
task_id learner_id resampling_id iteration prediction_test
german_credit classif.rpart cv 1 <PredictionClassif>
german_credit classif.rpart cv 2 <PredictionClassif>
german_credit classif.rpart cv 3 <PredictionClassif>
german_credit classif.rpart cv 4 <PredictionClassif>
german_credit classif.rpart cv 5 <PredictionClassif>
warnings errors
0 0
0 0
0 0
0 0
0 0
Convert to a data.table
Decide which hyperparameters to tune and over what ranges to tune them
id class lower upper nlevels
<char> <char> <num> <num> <num>
1: cachesize ParamDbl -Inf Inf Inf
2: class.weights ParamUty NA NA Inf
3: coef0 ParamDbl -Inf Inf Inf
4: cost ParamDbl 0 Inf Inf
5: cross ParamInt 0 Inf Inf
6: decision.values ParamLgl NA NA 2
7: degree ParamInt 1 Inf Inf
8: epsilon ParamDbl 0 Inf Inf
9: fitted ParamLgl NA NA 2
10: gamma ParamDbl 0 Inf Inf
11: kernel ParamFct NA NA 4
12: nu ParamDbl -Inf Inf Inf
to_tune() specifies the hyperparameter to tune and the range to tune over
learner = lrn("classif.svm",
type = "C-classification",
kernel = "radial",
cost = to_tune(1e-1, 1e5),
gamma = to_tune(1e-1, 1)
)
learner
<LearnerClassifSVM:classif.svm>: Support Vector Machine
* Model: -
* Parameters: cost=<RangeTuneToken>, gamma=<RangeTuneToken>,
kernel=radial, type=C-classification
* Packages: mlr3, mlr3learners, e1071
* Predict Types: [response], prob
* Feature Types: logical, integer, numeric
* Properties: multiclass, twoclass
| Terminator | Function call and default parameters |
|---|---|
| Clock Time | trm("clock_time") |
| Number of Evaluations | trm("evals", n_evals = 100, k = 0) |
| Performance Level | trm("perf_reached", level = 0.1) |
| Run Time | trm("run_time", secs = 30) |
| Stagnation | trm("stagnation", iters = 10, threshold = 0) |
trm("combo") allows combining multiple terminators
trm("none") is used by tuners that terminate on their own
| Terminator | Function call and default parameters |
|---|---|
| Combo | trm("combo", any = TRUE) |
| None | trm("none") |
Collects the information required to optimize a model
tsk_sonar = tsk("sonar")
instance = ti(
task = tsk_sonar,
learner = learner,
resampling = rsmp("cv", folds = 3),
measures = msr("classif.ce"),
terminator = trm("none")
)
instance
<TuningInstanceBatchSingleCrit>
* State: Not optimized
* Objective: <ObjectiveTuningBatch:classif.svm_on_sonar>
* Search Space:
id class lower upper nlevels
<char> <char> <num> <num> <num>
1: cost ParamDbl 0.1 1e+05 Inf
2: gamma ParamDbl 0.1 1e+00 Inf
* Terminator: <TerminatorNone>
There are multiple Tuner classes in mlr3tuning, which implement different HPO (or more generally speaking black box optimization) algorithms
The tnr() function is used to create a tuner
Basic algorithms
| Tuner | Function call | Package |
|---|---|---|
| Random Search | tnr("random_search") | mlr3tuning |
| Grid Search | tnr("grid_search") | mlr3tuning |
Adaptive algorithms learn from previously evaluated configurations
| Tuner | Function call | Package |
|---|---|---|
| CMA-ES | tnr("cmaes") | adagio |
| Generalized Simulated Annealing | tnr("gensa") | GenSA |
| Nonlinear Optimization | tnr("nloptr") | nloptr |
| Iterated Racing | tnr("irace") | irace |
More adaptive algorithms implemented in extension packages
| Tuner | Function call | Package |
|---|---|---|
| Hyperband | tnr("hyperband") | mlr3hyperband |
| Model-based Optimization | tnr("mbo") | mlr3mbo |
Grid search exhaustively evaluates every possible combination of given hyperparameter values
Control parameters can be set; as with learners, these are accessible with $param_set
We can start the tuning process
Pass the constructed TuningInstanceBatchSingleCrit to the $optimize() method of the initialized Tuner
cost gamma learner_param_vals x_domain classif.ce
<num> <num> <list> <list> <num>
1: 25000.08 0.1 <list[4]> <list[2]> 0.2837129
The optimizer returns the best hyperparameter configuration and the corresponding performance
The result is also stored in instance$result
$learner_param_vals lists the optimal hyperparameters from tuning, as well as the values of any other hyperparameters that were set
$x_domain field is most useful in the context of hyperparameter transformations
To add this transformation to a hyperparameter we simply pass logscale = TRUE
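For the SVM example above, the log-scale transformation might be added like this (a sketch; the ranges mirror the earlier slide):

```r
library(mlr3)
library(mlr3learners)
library(mlr3tuning)

# cost and gamma are now searched on a log scale
learner = lrn("classif.svm",
  type = "C-classification",
  kernel = "radial",
  cost = to_tune(1e-1, 1e5, logscale = TRUE),
  gamma = to_tune(1e-1, 1, logscale = TRUE)
)
```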
The instance’s archive lists all evaluated hyperparameter configurations
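A small self-contained sketch of inspecting the archive after tuning (a toy grid search on a single hyperparameter, kept cheap on purpose):

```r
library(mlr3)
library(mlr3tuning)

instance = tune(
  tuner = tnr("grid_search", resolution = 3),
  task = tsk("sonar"),
  learner = lrn("classif.rpart", cp = to_tune(1e-3, 1e-1, logscale = TRUE)),
  resampling = rsmp("holdout"),
  measures = msr("classif.ce")
)

# every evaluated configuration with its score
as.data.table(instance$archive)[, .(cp, classif.ce)]
```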
Visualize the results as a surface plot with mlr3viz
We can use the best hyperparameter configuration to train a final model on the whole data
lrn_svm_tuned = lrn("classif.svm")
lrn_svm_tuned$param_set$values = instance$result_learner_param_vals
lrn_svm_tuned$train(tsk_sonar)$model
Call:
svm.default(x = data, y = task$truth(), type = "C-classification",
kernel = "radial", gamma = 0.1, cost = 25000.075, probability = (self$predict_type ==
"prob"))
Parameters:
SVM-Type: C-classification
SVM-Kernel: radial
cost: 25000.08
Number of Support Vectors: 205
mlr3tuning includes two helper functions to simplify tuning
We use the same components as before
tune() creates a tuning instance and calls $optimize()
lrn_rpart = lrn("classif.rpart",
minsplit = to_tune(2, 128, logscale = TRUE),
minbucket = to_tune(1, 64, logscale = TRUE),
cp = to_tune(1e-04, 1e-1, logscale = TRUE)
)
instance = ti(
task = tsk("pima"),
learner = lrn_rpart,
resampling = rsmp("cv", folds = 3),
measures = msr("classif.ce"),
terminator = trm("evals", n_evals = 100)
)
tuner = tnr("random_search", batch_size = 10)
tuner$optimize(instance)
auto_tuner
An AutoTuner inherits from the Learner class and wraps all tuning components
Runs tune() when $train() is called
Then trains a model on the whole data with the optimal configuration
auto_tuner
<AutoTuner:classif.svm.tuned>
* Model: -
* Parameters: list()
* Packages: mlr3, mlr3tuning, mlr3learners, e1071
* Predict Types: [response], prob
* Feature Types: logical, integer, numeric
* Properties: multiclass, twoclass
* Search Space:
id class lower upper nlevels
<char> <char> <num> <num> <num>
1: cost ParamDbl -11.51293 11.51293 Inf
2: gamma ParamDbl -11.51293 11.51293 Inf
AutoTuner
The inner resampling is a 4-fold CV and the outer resampling is a 3-fold CV
Pass an AutoTuner to resample() or benchmark() to start the nested resampling
AutoTuner
The estimated performance of a tuned model is reported as the aggregated performance of all outer resampling iterations
Optimal configurations across all outer folds
Full tuning archives
mlr3tuningspaces extension package contains various search spaces from Bischl et al. (2023), Kuehn et al. (2018) and Binder, Pfisterer, and Bischl (2020).
The sugar function lts() (learner tuning space) is used to retrieve a TuningSpace
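A sketch of both uses of lts(); the key "classif.rpart.default" is assumed to be one of the shipped default spaces:

```r
library(mlr3)
library(mlr3tuning)
library(mlr3tuningspaces)

# retrieve a predefined TuningSpace by key
lts("classif.rpart.default")

# or apply a default tuning space directly to a learner
learner = lts(lrn("classif.rpart"))
```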
ps()
How to create a search space to tune cost and gamma
Pass search space to tuning instance
ti(tsk_sonar, lrn("classif.svm", type = "C-classification"), rsmp_cv3,
  msr_ce, trm("none"), search_space = search_space)
<TuningInstanceBatchSingleCrit>
* State: Not optimized
* Objective: <ObjectiveTuningBatch:classif.svm_on_sonar>
* Search Space:
id class lower upper nlevels
<char> <char> <num> <num> <num>
1: cost ParamDbl 0.1 1e+05 Inf
2: kernel ParamFct NA NA 2
3: shrinking ParamLgl NA NA 2
* Terminator: <TerminatorNone>
Exponentiate cost, and add 2 to it if kernel is "polynomial"
search_space = ps(
cost = p_dbl(-1, 1, trafo = function(x) exp(x)),
kernel = p_fct(c("polynomial", "radial")),
.extra_trafo = function(x, param_set) {
if (x$kernel == "polynomial") x$cost = x$cost + 2
x
}
)
search_space$trafo(list(cost = 1, kernel = "radial"))
$cost
[1] 2.718282

$kernel
[1] "radial"

search_space$trafo(list(cost = 1, kernel = "polynomial"))
$cost
[1] 4.718282

$kernel
[1] "polynomial"
lrn_rpart = lrn("classif.rpart",
minsplit = to_tune(2, 128, logscale = TRUE),
minbucket = to_tune(1, 64, logscale = TRUE),
cp = to_tune(1e-04, 1e-1, logscale = TRUE)
)
at = auto_tuner(
tuner = tnr("random_search", batch_size = 10),
learner = lrn_rpart,
resampling = rsmp("cv", folds = 4),
  measure = msr("classif.ce"),
  term_evals = 10  # termination criterion (value assumed)
)
rr = resample(tsk("pima"), at, rsmp("cv", folds = 3), store_models = TRUE)
Workflows including data preprocessing, building ensemble models, or more complicated meta-models
PipeOps are the building blocks
PipeOps are connected to form a Graph or pipeline
Short for Pipeline Operator
Includes a $train() and a $predict() method
Has a $param_set field that defines the hyperparameters
Constructed with the po() function
PipeOp includes a $train() and a $predict() method
The po("pca") applies a principal component analysis
tsk_small = tsk("penguins_simple")$select(c("bill_depth", "bill_length"))
po_pca = po("pca")
poin = list(tsk_small$clone()$filter(1:5))
poout = po_pca$train(poin) # poin: Task in a list
poout # list with a single element 'output'
$output
<TaskClassif:penguins> (5 x 3): Simplified Palmer Penguins
* Target: species
* Properties: multiclass
* Features (2):
- dbl (2): PC1, PC2
species PC1 PC2
<fctr> <num> <num>
1: Adelie 0.1561004 0.005716376
2: Adelie 1.2676891 0.789534280
3: Adelie 1.5336113 -0.174460208
The training phase typically generates a particular model of the data, which is saved as the internal $state field
The $state field of po("pca") contains the rotation matrix
This state is then used during predictions and applied to new data
PipeOps represent individual computational steps in machine learning pipelines
These pipelines themselves are defined by Graph objects
A Graph is a collection of PipeOps with “edges” that guide the flow of data
The most convenient way of building a Graph is to connect a sequence of PipeOps using the %>>%-operator
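A minimal sketch of building a sequential Graph with the %>>% operator (the concrete steps are an arbitrary illustration):

```r
library(mlr3)
library(mlr3pipelines)

# scale, then PCA, then a decision tree
graph = po("scale") %>>% po("pca") %>>% lrn("classif.rpart")

# visualize the pipeline
graph$plot()
```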
Most common application for mlr3pipelines is to preprocess data before feeding it into a Learner
Learner objects can be converted to PipeOps
To use a Graph as a Learner with an identical interface, it can be wrapped in a GraphLearner object with as_learner()
lrn_logreg = lrn("classif.log_reg")
glrn_sample = as_learner(po("imputesample") %>>% lrn_logreg)
glrn_mode = as_learner(po("imputemode") %>>% lrn_logreg)
design = benchmark_grid(tsk("pima"), list(glrn_sample, glrn_mode),
rsmp("cv", folds = 3))
bmr = benchmark(design)
aggr = bmr$aggregate()[, .(learner_id, classif.ce)]
aggr
                     learner_id classif.ce
<char> <num>
1: imputesample.classif.log_reg 0.2330729
2: imputemode.classif.log_reg 0.2330729
PipeOp hyperparameters are collected together in the $param_set of a graph and prefixed with the ID of the PipeOp
graph = po("scale", center = FALSE, scale = TRUE, id = "scale") %>>%
po("scale", center = TRUE, scale = FALSE, id = "center") %>>%
lrn("classif.rpart", cp = 1)
unlist(graph$param_set$values)
  scale.center     scale.scale    scale.robust   center.center
0 1 0 1
center.scale center.robust classif.rpart.cp classif.rpart.xval
0 0 1 0
Non-sequential pipelines can perform more complex operations
Using the gunion() function, we can instead combine multiple PipeOps, Graphs, or a mixture of both, into a parallel Graph
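A sketch of a parallel Graph: two paths run side by side and their outputs are combined (po("featureunion") cbinds the resulting features):

```r
library(mlr3)
library(mlr3pipelines)

# apply PCA and a no-op in parallel, then combine the features
graph = gunion(list(po("pca"), po("nop"))) %>>%
  po("featureunion")
graph$plot()
```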
Many common problems in ML can be well solved by the same pipelines
ppl("bagging", graph) creates a bagging ensemble
ppl("branch", graphs) creates a branch
ppl("robustify") common preprocessing steps
ppl("stacking", base_learners, super_learner) creates a stacking ensemble
po("branch") creates multiple paths such that data can only flow through one of these as determined by the selection hyperparameter
Use po("unbranch") (with the same options as "branch") to merge the outputs back into one result object
To demonstrate alternative paths we will make use of the MNIST (LeCun et al. 1998) data, which is useful for demonstrating preprocessing
Do nothing po("nop")
Apply PCA po("pca")
Remove constant features po("removeconstants") then apply the Yeo-Johnson transform po("yeojohnson")
The output of this Graph depends on the setting of the branch.selection hyperparameter
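A branching graph with these three paths could be constructed roughly as follows (a sketch; the id "brnchPO" and path names are assumptions matching the hyperparameter used below):

```r
library(mlr3)
library(mlr3pipelines)

paths = c("nop", "pca", "yeojohnson")
graph = po("branch", options = paths, id = "brnchPO") %>>%
  gunion(list(
    po("nop"),                                    # do nothing
    po("pca"),                                    # apply PCA
    po("removeconstants") %>>% po("yeojohnson")   # remove constants, then Yeo-Johnson
  )) %>>%
  po("unbranch", options = paths, id = "unbranchPO")
```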
# use the "PCA" path
graph$param_set$values$brnchPO.selection = "pca"
# new PCA columns
head(graph$train(tsk_mnist)[[1]]$feature_names)
[1] "PC1" "PC2" "PC3" "PC4" "PC5" "PC6"
# use the "No-Op" path
graph$param_set$values$brnchPO.selection = "nop"
# same features
head(graph$train(tsk_mnist)[[1]]$feature_names)
[1] "pixel1"  "pixel3"  "pixel22" "pixel32" "pixel34" "pixel38"
Branching can even be used to tune which of several learners is most appropriate for a given dataset
Tuning the selection hyperparameters can help determine which of the possible options work best in combination
graph_learner = as_learner(graph)
graph_learner$param_set$set_values(
brnchPO.selection = to_tune(paths),
branch.selection = to_tune(c("classif.rpart", "classif.kknn")),
classif.kknn.k = to_tune(p_int(1, 32,
depends = branch.selection == "classif.kknn"))
)
instance = tune(tnr("grid_search"), tsk_mnist, graph_learner,
rsmp("repeated_cv", folds = 3, repeats = 3), msr("classif.ce"))
instance$archive$data[order(classif.ce)[1:5],
.(brnchPO.selection, classif.kknn.k, branch.selection, classif.ce)]
autoplot(instance)
Parallelization: future, mlr3batchmark and rush
Logging: lgr package